-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
align from-less query semantics with SQL #5418
Conversation
This commit changes the interpretation of a query that omits the from operator. Previously, a lake query would use the default "HEAD" parameter as an implied "from HEAD". This contradicts SQL which assumes a table with one empty row (like the implied yield null case). The new assumption in a lake query for a from-less query is an implied null source so that a "select 'hello'" query will return not a single 'hello' rather than a sequence based on the implied read from HEAD. Similary, a from-less command-line query (i.e., super -c) with no file arguments also has an automatic null scan inserted. As part of this change, we moved the logic for inserting the null scan into the DAG (along with the logic for inserting DefaultScan of zio.Readers) into the semantic pass. These changes mean that lake queries that read from a pool require an explicit "from" and there is no impleied pool from HEAD. Because of this, we removed the HEAD references in the code paths that involve lake queries. HEAD is still used by the various "super db" commands to refer to a default pool/branch/commit. Currently, the null scan does not follow the done protocol and restart. We will need to add this when we want from-less subqueries to restart.
compiler/parser/api.go
Outdated
return errors.New("internal error: AST seq cannot be empty") | ||
return errors.New("delete where command requires an expression") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The parser shouldn't know about the "delete where" command so I'd leave this error string as is.
compiler/parser/api.go
Outdated
@@ -39,6 +43,9 @@ func ParseQuery(query string, filenames ...string) (*AST, error) { | |||
if err != nil { | |||
return nil, err | |||
} | |||
if files.Text == "" { | |||
return &AST{seq: []ast.Op{}, files: files}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
return &AST{seq: []ast.Op{}, files: files}, nil | |
return &AST{files: files}, nil |
compiler/semantic/analyzer.go
Outdated
} else if extInput { | ||
seq.Prepend(&dag.DefaultScan{Kind: "DefaultScan"}) | ||
} else { | ||
// This is a local query and there's not external input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This is a local query and there's not external input | |
// This is a local query and there's no external input |
This commit changes the interpretation of a query that omits the from operator. Previously, a lake query would use the default "HEAD" parameter as an implied "from HEAD". This contradicts SQL which assumes a table with one empty row (like the implied yield null case).
The new assumption in a lake query for a from-less query is an implied null source so that a "select 'hello'" query will return a single 'hello' rather than a sequence based on the implied read from HEAD.
Similary, a from-less command-line query (i.e., super -c) with no file arguments also has an automatic null scan inserted.
As part of this change, we moved the logic for inserting the null scan (along with the logic for inserting DefaultScan of zio.Readers) into the semantic pass (instead of by lake/file compiler).
These changes mean that lake queries that read from a pool require an explicit "from" and there is no implied pool from HEAD. Because of this, we removed the HEAD references in the code paths that involve lake queries. HEAD is still used by the various "super db" commands to refer to a default pool/branch/commit.
Currently, the null scan does not follow the done protocol and restart. We will need to add this when we want from-less subqueries to restart.